Rigid geometry solves “curse of dimensionality” effects in clustering methods: An application to omics data
نویسنده
چکیده
The quality of samples preserved long term at ultralow temperatures has not been adequately studied. To improve our understanding, we need a strategy to analyze protein degradation and metabolism at subfreezing temperatures. To do this, we obtained liquid chromatography-mass spectrometry (LC/MS) data of calculated protein signal intensities in HEK-293 cells. Our first attempt at directly clustering the values failed, most likely due to the so-called "curse of dimensionality". The clusters were not reproducible, and the outputs differed with different methods. By utilizing rigid geometry with a prime ideal I-adic (p-adic) metric, however, we rearranged the sample clusters into a meaningful and reproducible order, and the results were the same with each of the different clustering methods tested. Furthermore, we have also succeeded in application of this method to expression array data in similar situations. Thus, we eliminated the "curse of dimensionality" from the data set, at least in clustering methods. It is possible that our approach determines a characteristic value of systems that follow a Boltzmann distribution.
منابع مشابه
A Coupled Rigid-viscoplastic Numerical Modeling for Evaluating Effects of Shoulder Geometry on Friction Stir-welded Aluminum Alloys
Shoulder geometry of tool plays an important role in friction-stir welding because it controls thermal interactions and heat generation. This work is proposed and developed a coupled rigid-viscoplastic numerical modeling based on computational fluid dynamics and finite element calculations aiming to understand these interactions. Model solves mass conservation, momentum, and energy equations in...
متن کاملAn Information Geometric Framework for Dimensionality Reduction
This report concerns the problem of dimensionality reduction through information geometric methods on statistical manifolds. While there has been considerable work recently presented regarding dimensionality reduction for the purposes of learning tasks such as classification, clustering, and visualization, these methods have focused primarily on Riemannian manifolds in Euclidean space. While su...
متن کاملA H-K Clustering Algorithm For High Dimensional Data Using Ensemble Learning
Advances made to the traditional clustering algorithms solves the various problems such as curse of dimensionality and sparsity of data for multiple attributes. The traditional H-K clustering algorithm can solve the randomness and apriority of the initial centers of K-means clustering algorithm. But when we apply it to high dimensional data it causes the dimensional disaster problem due to high...
متن کاملA Multi Linear Discriminant Analysis Method Using a Subtraction Criteria
Linear dimension reduction has been used in different application such as image processing and pattern recognition. All these data folds the original data to vectors and project them to an small dimensions. But in some applications such we may face with data that are not vectors such as image data. Folding the multidimensional data to vectors causes curse of dimensionality and mixed the differe...
متن کاملUsing k-Nearest Neighbor and Feature Selection as an Improvement to Hierarchical Clustering
Clustering of data is a difficult problem that is related to various fields and applications. Challenge is greater, as input space dimensions become larger and feature scales are different from each other. Hierarchical clustering methods are more flexible than their partitioning counterparts, as they do not need the number of clusters as input. Still, plain hierarchical clustering does not prov...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 12 شماره
صفحات -
تاریخ انتشار 2017